Justifying the prediction of major soil nutrients levels (N, P, and K) in cabbage cultivation

In a recent paper by Sajindra et al. [1], the soil nutrient levels in organic cabbage cultivation, specifically nitrogen, phosphorus, and potassium, were predicted using a deep learning model. The model was designed with four hidden layers, excluding the input and output layers, and each hidden layer contains ten nodes. The tangent sigmoid transfer function was selected as the optimal activation function for the dataset based on the coefficient of correlation, the mean squared error, and the accuracy of the predicted results. The objective of this study is to justify the choice of the tangent sigmoid transfer function and to provide a mathematical justification for the obtained results.

• This paper presents a comprehensive methodology for developing a deep neural network to predict soil nutrient levels.
• The use of the tangent sigmoid transfer function in the predictions is justified.
• The methodology can be adapted to similar real-world scenarios.

Specifications table
Subject area: Engineering
More specific subject area: Machine Learning
Name of your method: Deep Neural Network (DNN)
Name and reference of original method: Sajindra, H., Abekoon, T., Jayakody, J. A. D. C. A., & Rathnayake, U. (2024). A novel deep learning model to predict the soil nutrient levels (N, P, and K) in cabbage cultivation. Smart Agricultural Technology, 7, 100395. doi: 10.1016/j.atech.2023.100395
Resource availability: The data can be requested from the corresponding author for research purposes only.

Background
Sajindra et al. [1] investigated the major nutrient contents of soil, specifically nitrogen (N), phosphorus (P), and potassium (K), in conjunction with the growth characteristics of cabbage plants, including plant height, number of leaves, and average leaf area. A deep learning model was employed to identify the complex relationship between plant characteristics and soil nutrient content. Once these relationships within the dataset had been identified, the Deep Neural Network (DNN) was used to predict the major soil nutrient content with high accuracy [1]. The DNN model, whose architecture consists of four hidden layers with ten nodes each, was trained using the Levenberg-Marquardt optimization algorithm. Training yielded the best performance metrics, namely the lowest Mean Squared Error (MSE) values and the highest correlation coefficient (r) values. Notably, the use of the TanSig (hyperbolic tangent sigmoid) transfer function during training contributed to these favorable outcomes [2]. The tansig function, known for its ability to model complex relationships, enhanced the DNN's capacity to capture intricate patterns in the data and thereby helped achieve superior performance metrics, which are essential indicators of how well the model learns and generalizes [3, 4].

Dataset
The data for the model were acquired from three model agricultural farms of green coronet cabbage in Marassana, Nuwara Eliya in Central province, and Welimada in Uva province of Sri Lanka. Measurements were taken systematically at 7-day intervals, starting from the planting of germinated cabbage seeds and spanning 85 days. The simultaneous measurements covered the major soil nutrient concentrations (N, P, and K) along with key plant growth indicators comprising plant height, number of leaves, and average leaf area (refer to Table 1 for sample data). This dataset, collected over a fixed timeframe and at regular intervals, provides a robust foundation for a detailed analysis of the growth dynamics and nutrient interactions in green coronet cabbage cultivation across different locations in the central hills of Sri Lanka [1].

Primary equation
As the cabbage plant grows, a decrease in the major soil nutrient content was observed, accompanied by an increase in plant growth characteristics such as plant height, number of leaves, and average leaf area. These three crucial macronutrients simultaneously influence the enhancement of plant growth characteristics. Consequently, Eq. (1) was formulated to capture the relationship between these macronutrients and the plant growth characteristics [1]. Accordingly, the major nutrients of the soil can be predicted from the plant growth characteristics.
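The dependence described above can be sketched in functional form. This is only an illustrative restatement based on the inputs and outputs named in this paper, not necessarily the exact form of Eq. (1) in [1]; the symbols D, H, L, and A are assumptions introduced here for the number of days, plant height, number of leaves, and average leaf area.

```latex
\[
(N,\; P,\; K) = f(D,\; H,\; L,\; A)
\]
```

Here f stands for the mapping learned by the DNN from the plant growth characteristics to the three major soil nutrient levels.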

The DNN model and TanSig function
The neural network architecture designed for this task includes four hidden layers with ten nodes each, and the tansig transfer function was employed as the activation function. Here, w denotes the weights and b the neuron's bias, as represented in Fig. 1. Positioned between the input and output layers, these hidden layers play a crucial role in enabling the network to discern intricate patterns and relationships inherent in the dataset [1, 5].
The input layer is configured with four nodes that accommodate the key input factors of the weekly dataset: the number of days, the height of the cabbage plant, the number of cabbage leaves, and the average area of the cabbage leaves. The hyperbolic tangent sigmoid transfer function, often denoted as tansig, is a non-linear activation function frequently employed in neural networks. It is mathematically equivalent to the hyperbolic tangent function (tanh) but is computed differently; tanh itself may provide greater accuracy and is recommended for applications that necessitate the use of the hyperbolic tangent function. The tanh function is nonlinear and produces output values that fall within the range [−1, 1] [6, 7]. Notably, the gradient of the tanh function tends to be steeper than that of the sigmoid function [7]. The operational mechanism of the tanh activation function is illustrated in Fig. 2, where the input features are denoted as x, the output as f(x), the weights as w, and the bias of the neuron as b. In addition, the activation function, denoted as f, is applied to the value of each neuron and determines whether the neuron is active or not [8].
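The architecture described above can be sketched as a forward pass in NumPy. This is a minimal illustration and not the trained model of [1]: the weights are random and untrained, the output layer is assumed to be linear with three nodes (N, P, K), the example inputs are hypothetical and assumed to be already normalised, and the helper names `init_network` and `forward` are introduced here for illustration only.

```python
import numpy as np

def init_network(layer_sizes, seed=0):
    """Create random (untrained) weights w and zero biases b for each layer."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = rng.normal(scale=0.5, size=(n_out, n_in))
        b = np.zeros(n_out)
        params.append((w, b))
    return params

def forward(params, x):
    """Propagate one input vector through the network."""
    a = np.asarray(x, dtype=float)
    for w, b in params[:-1]:
        a = np.tanh(w @ a + b)   # hidden layers: f(w.x + b) with f = tanh
    w_out, b_out = params[-1]
    return w_out @ a + b_out     # linear output layer (an assumption)

# 4 inputs -> four hidden layers of 10 nodes -> 3 outputs (N, P, K)
LAYER_SIZES = [4, 10, 10, 10, 10, 3]
params = init_network(LAYER_SIZES)

# Hypothetical normalised inputs: days, plant height, leaf count, leaf area
x_example = np.array([0.5, 0.3, 0.4, 0.2])
npk = forward(params, x_example)
print(npk.shape)  # (3,)
```

Each hidden neuron computes the weighted sum of its inputs plus its bias and passes the result through the activation function, which is the mechanism depicted for a single neuron in Fig. 2.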
The mathematical expression defining the tanh function is given by Eq. (2).
\[ \tanh(x) = \frac{\sinh(x)}{\cosh(x)} = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{2} \]
The tansig function is calculated as the ratio between the hyperbolic sine and hyperbolic cosine functions. Such sigmoidal transfer functions are used extensively in the hidden layers of neural networks. Characterized by their S-shaped curve, they provide the non-linear activation of neurons and enhance the capacity of ANNs to capture intricate patterns and relationships within data [9, 10].
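The relationship between tansig, tanh, and the ratio of hyperbolic sine to hyperbolic cosine can be checked numerically. The sketch below assumes the closed form 2 / (1 + exp(−2n)) − 1 for tansig, which is the form documented for MATLAB's `tansig`; the function name and the test grid are illustrative choices.

```python
import numpy as np

def tansig(n):
    """Tangent sigmoid in the closed form 2 / (1 + exp(-2n)) - 1 (assumed form)."""
    return 2.0 / (1.0 + np.exp(-2.0 * n)) - 1.0

n = np.linspace(-5.0, 5.0, 101)

# Ratio of hyperbolic sine to hyperbolic cosine
ratio = np.sinh(n) / np.cosh(n)

# Largest absolute deviation of each form from NumPy's tanh
max_diff_tansig = np.max(np.abs(tansig(n) - np.tanh(n)))
max_diff_ratio = np.max(np.abs(ratio - np.tanh(n)))

print(max_diff_tansig, max_diff_ratio)
```

Both differences should be at the level of floating-point rounding, which illustrates that the three expressions describe the same S-shaped function bounded by −1 and 1.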

The connection between the input and output of the tanh function
The tanh activation function satisfies the criteria of being both nonlinear and differentiable [11]. Its differentiability with respect to the respective input and output metrics can be expressed by Eq. (3).
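As an illustration of this differentiability, the standard identity d/dx tanh(x) = 1 − tanh²(x) can be verified against a central finite difference. This sketch uses the textbook derivative and is not claimed to reproduce the exact notation of Eq. (3); the step size and grid are arbitrary choices.

```python
import numpy as np

def tanh_derivative(x):
    """Analytic derivative of tanh, written in terms of its own output."""
    return 1.0 - np.tanh(x) ** 2

x = np.linspace(-3.0, 3.0, 61)
h = 1e-5  # finite-difference step (arbitrary small value)

# Central finite-difference approximation of the derivative
numeric = (np.tanh(x + h) - np.tanh(x - h)) / (2.0 * h)
max_error = np.max(np.abs(numeric - tanh_derivative(x)))

print(max_error)
```

Because the derivative depends only on the neuron's output, it is cheap to evaluate during gradient-based training such as the Levenberg-Marquardt algorithm used for this model.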

Calculation of r and MSE
The following section presents an example calculation of r and MSE for the developed model. The notation used in the calculation is explained as follows.
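A calculation of this kind can be sketched as follows, assuming the standard Pearson definition of r (with sample standard deviations) and the plain mean of squared differences for MSE. The numeric values are hypothetical placeholders, not data from the study, and the number of explanatory variables k is not used in this simplified sketch.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between predicted values x and actual values y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    s_x = x.std(ddof=1)  # sample standard deviation of predicted values
    s_y = y.std(ddof=1)  # sample standard deviation of actual values
    return np.sum((x - x.mean()) * (y - y.mean())) / ((n - 1) * s_x * s_y)

def mse(x, y):
    """Mean squared error between predicted values x and actual values y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((y - x) ** 2)

# Hypothetical predicted and actual nutrient values (illustration only)
predicted = np.array([10.2, 9.1, 8.4, 7.0, 6.3])
actual = np.array([10.0, 9.4, 8.1, 7.2, 6.0])

print(pearson_r(predicted, actual), mse(predicted, actual))
```

An r value close to 1 together with a small MSE indicates close agreement between predicted and actual values, which are the criteria used in this paper to assess the transfer function.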

Actual vs predicted soil nutrient values under TanSig function
Fig. 3 presents a detailed comparison between the major soil nutrient values predicted by the DNN model and the actual values. This comparison reflects the accuracy of the predictions made by the model using the tansig function. A variety of randomly chosen plant growth details were included in this graph. The actual and predicted values for P and K are similar on most days, whereas the values for N show small differences between actual and predicted values relative to the P and K lines. Overall, the predictions obtained with the tansig function agree closely with the actual values.

Conclusions
This study has offered a comprehensive justification, both theoretical and mathematical, for the appropriateness of the tansig activation function. The sample calculations, followed by a series of additional calculations, have justified the selection of the tansig activation function by establishing a solid mathematical foundation for r and MSE. In line with this justification, the tansig function is deemed the most suitable activation function for the dataset, as it yields predicted soil nutrient values that agree well with the actual values.

Limitations
The method is demonstrated exclusively for the Green Coronet cabbage variety, and the collected data may vary depending on the climatic factors and soil conditions of the cultivation environment. Therefore, the global applicability of this model is limited. In addition, the model can be further developed to understand the nutrient requirements of other vegetables, given the relevant data.

S_x - Standard deviation of predicted values
x_i - Value of each predicted value
x̄ - Average of predicted values
S_y - Standard deviation of actual values
y_i - Value of each actual value
ȳ - Average of actual values
n - Number of observations
k - Number of explanatory variables

Table 1
Sample data collected from organic cabbage cultivation on a weekly basis.